


WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 7: C07K 14/47, C12P 21/02

(11) International Publication Number: WO 00/27878 A1

(43) International Publication Date: 18 May 2000 (18.05.00)



(21) International Application Number: PCT/GB99/03730

(22) International Filing Date: 9 November 1999 (09.11.99)

(30) Priority Data:
9824544.2    9 November 1998 (09.11.98)    GB



(71) Applicant (for all designated States except US): GENDAQ
LIMITED [GB/GB]; 1-3 Burtonhole Lane, Mill Hill, London
NW7 1AD (GB).

(72) Inventor; and

(75) Inventor/Applicant (for US only): CHOO, Yen [GB/GB];
MRC Laboratory of Molecular Biology, Medical Research
Council Centre, Hills Road, Cambridge CB2 2QH (GB).

(74) Agents: MASCHIO, Antonio et al.; D. Young & Co., 21 New
Fetter Lane, London EC4A 1DA (GB).



(81) Designated States: AE, AL, AM, AT, AU, AZ, BA, BB, BG,
BR, BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB,
GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG,
KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK,
MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG,
SI, SK, SL, TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU,
ZA, ZW, ARIPO patent (GH, GM, KE, LS, MW, SD, SL,
SZ, TZ, UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ,
MD, RU, TJ, TM), European patent (AT, BE, CH, CY, DE,
DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE),
OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML,
MR, NE, SN, TD, TG).



Published



With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: SCREENING SYSTEM FOR ZINC FINGER POLYPEPTIDES FOR A DESIRED BINDING ABILITY 



(57) Abstract 

The invention relates to a method for producing a zinc finger nucleic acid binding protein comprising preparing a zinc finger protein
according to design rules, varying the protein at one or more positions, and selecting variants which bind to a target nucleic acid sequence by
polysome display.






SCREENING SYSTEM FOR ZINC FINGER POLYPEPTIDES FOR A DESIRED BINDING ABILITY

The present application relates to a method for screening zinc finger polypeptides for a
desired binding ability. In particular, the invention relates to a polysome display
technique which permits the isolation of binding polypeptides without resorting to
phage display techniques.

Protein-nucleic acid recognition is a commonplace phenomenon which is central to a
large number of biomolecular control mechanisms which regulate the functioning of
eukaryotic and prokaryotic cells. For instance, protein-DNA interactions form the
basis of the regulation of gene expression and are thus one of the subjects most widely
studied by molecular biologists.

A wealth of biochemical and structural information explains the details of protein-DNA
recognition in numerous instances, to the extent that general principles of recognition
have emerged. Many DNA-binding proteins contain independently folded domains for
the recognition of DNA, and these domains in turn belong to a large number of
structural families, such as the leucine zipper, the "helix-turn-helix" and zinc finger
families.

Most sequence-specific DNA-binding proteins bind to the DNA double helix by inserting
an α-helix into the major groove (Pabo & Sauer 1992 Annu. Rev. Biochem. 61,
1053-1095; Harrison 1991 Nature (London) 353, 715-719; and Klug 1993 Gene 135,
83-92). Sequence specificity results from the geometrical and chemical complementarity
between the amino acid side chains of the α-helix and the accessible groups exposed on
the edges of base-pairs. In addition to this direct reading of the DNA sequence,
interactions with the DNA backbone stabilise the complex and are sensitive to the
conformation of the nucleic acid, which in turn depends on the base sequence (Dickerson
& Drew 1981 J. Mol. Biol. 149, 761-786). Crystal structures of protein-DNA complexes
have shown that proteins can be idiosyncratic in their mode of DNA recognition, at least
[...] DNA, allowing a variety of different base contacts to be made by a single amino acid and
vice versa (Matthews 1988 Nature (London) 335, 294-295).

Protein engineering experiments have shown that it is possible to alter rationally the
DNA-binding characteristics of individual zinc fingers when one or more of the α-helical
positions is varied in a number of proteins (Nardelli et al., 1991 Nature (London) 349,
175-178; Nardelli et al., 1992 Nucleic Acids Res. 20, 4137-4144; and Desjarlais & Berg
1992a Proteins 13, 272). It has already been possible to propose some principles relating
amino acids on the α-helix to corresponding bases in the bound DNA sequence
(Desjarlais & Berg 1992b Proc. Natl. Acad. Sci. USA 89, 7345-7349). However, in this
approach the altered positions on the α-helix are prejudged, making it possible to
overlook the role of positions which are not currently considered important; and secondly,
owing to the importance of context, concomitant alterations are sometimes required to
affect specificity (Desjarlais & Berg 1992b), so that a significant correlation between an
amino acid and base may be misconstrued.

More sophisticated principles describing the relationship between the sequence of the zinc 
finger and the nucleic acid target have been described, for example in WO 96/06166 
(Medical Research Council). 

To investigate binding of mutant Zf proteins, Thiesen and Bach (1991 FEBS 283, 23-26)
mutated Zf fingers and studied their binding to randomised oligonucleotides, using
electrophoretic mobility shift assays. Subsequent use of phage display technology has
permitted the expression of random libraries of Zf mutant proteins on the surface of
bacteriophage. The three Zf domains of Zif268, with 4 positions within finger one
randomised, have been displayed on the surface of filamentous phage by Rebar and Pabo
(1994 Science 263, 671-673). The library was then subjected to rounds of affinity
selection by binding to target DNA oligonucleotide sequences in order to obtain Zf
proteins with new binding specificities. Randomised mutagenesis (at the same positions as
those selected by Rebar & Pabo) of finger 1 of Zif268 with phage display has also been
used by Jamieson et al. (1994 Biochemistry 33, 5689-5695) to create novel binding
specificity and affinity.



In summary, it is known that Zf protein motifs are widespread in DNA binding proteins
and that binding is via three key amino acids, each one contacting a single base pair in the
target DNA sequence. Motifs are modular and may be linked together to form a set of
fingers which recognise a contiguous DNA sequence (e.g. a three fingered protein will
recognise a 9mer etc). The key residues involved in DNA binding have been identified
through sequence data and from structural information. Directed and random
mutagenesis has confirmed the role of these amino acids in determining specificity and
affinity. Phage display has been used to screen for new binding specificities of random
mutants of fingers. Therefore, the combination of a set of rules with a selection process
appears to provide the most promising avenue for the development of zinc finger proteins.



Summary of the Invention



According to a first aspect of the present invention, there is provided a method for
producing a zinc finger nucleic acid binding protein comprising preparing a zinc finger
protein according to design rules, varying the protein at one or more positions, and
selecting variants which bind to a target nucleic acid sequence by polysome display.

According to a second aspect of the present invention, there is provided a method for
producing a zinc finger nucleic acid binding protein comprising an at least partially
varied sequence and selecting variants thereof which bind to a target DNA strand,
comprising the steps of:

(i) preparing a nucleic acid binding protein of the Cys2-His2 zinc finger class
capable of binding to a nucleic acid triplet in a target nucleic acid sequence, wherein
binding to each base of the triplet by an α-helical zinc finger nucleic acid binding motif
in the protein is determined as follows:

a) if the 5' base in the triplet is G, then position +6 in the α-helix is Arg; or position
+6 is Ser or Thr and position ++2 is Asp;

b) if the 5' base in the triplet is A, then position +6 in the α-helix is Gln and ++2 is
not Asp;

c) if the 5' base in the triplet is T, then position +6 in the α-helix is Ser or Thr and
position ++2 is Asp;

d) if the 5' base in the triplet is C, then position +6 in the α-helix may be any amino
acid, provided that position ++2 in the α-helix is not Asp;

e) if the central base in the triplet is G, then position +3 in the α-helix is His;

f) if the central base in the triplet is A, then position +3 in the α-helix is Asn;

g) if the central base in the triplet is T, then position +3 in the α-helix is Ala, Ser or
Val; provided that if it is Ala, then one of the residues at -1 or +6 is a small residue;

h) if the central base in the triplet is C, then position +3 in the α-helix is Ser, Asp,
Glu, Leu, Thr or Val;

i) if the 3' base in the triplet is G, then position -1 in the α-helix is Arg;

j) if the 3' base in the triplet is A, then position -1 in the α-helix is Gln;

k) if the 3' base in the triplet is T, then position -1 in the α-helix is Asn or Gln;

l) if the 3' base in the triplet is C, then position -1 in the α-helix is Asp;

(ii) varying the resultant polypeptide at at least one position; and

(iii) selecting the variants which bind to a target nucleic acid sequence by polysome
display.
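
By way of illustration only, rules a) to l) can be written as lookup tables keyed on the 5', central and 3' bases of the triplet. The sketch below is not part of the claimed method; the names FIVE_PRIME, CENTRAL, THREE_PRIME and helix_options are hypothetical, and the proviso attached to Ala in rule g) is noted in a comment but not enforced.

```python
# Rules a)-l) as lookup tables. Each 5' option pairs the residues allowed at +6
# with the constraint placed on ++2 (None means the rule states no constraint).
FIVE_PRIME = {
    "G": [(("Arg",), None), (("Ser", "Thr"), "Asp")],   # rule a)
    "A": [(("Gln",), "not Asp")],                       # rule b)
    "T": [(("Ser", "Thr"), "Asp")],                     # rule c)
    "C": [(("any",), "not Asp")],                       # rule d)
}
CENTRAL = {
    "G": ("His",),                                      # rule e)
    "A": ("Asn",),                                      # rule f)
    # rule g): Ala additionally needs a small residue at -1 or +6 (not checked here)
    "T": ("Ala", "Ser", "Val"),
    "C": ("Ser", "Asp", "Glu", "Leu", "Thr", "Val"),    # rule h)
}
THREE_PRIME = {
    "G": ("Arg",),                                      # rule i)
    "A": ("Gln",),                                      # rule j)
    "T": ("Asn", "Gln"),                                # rule k)
    "C": ("Asp",),                                      # rule l)
}


def helix_options(triplet):
    """Return the residue choices at -1, +3 and +6/++2 for a triplet written 5' to 3'."""
    triplet = triplet.upper()
    if len(triplet) != 3 or any(base not in "GATC" for base in triplet):
        raise ValueError("triplet must be three bases drawn from G, A, T, C")
    five, central, three = triplet
    return {
        "+6/++2": FIVE_PRIME[five],
        "+3": CENTRAL[central],
        "-1": THREE_PRIME[three],
    }
```

For example, helix_options("GCG") lists Arg at -1, the six rule h) residues at +3, and the two alternatives of rule a) for +6 and ++2.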



Detailed Description of the Invention

All of the nucleic acid-binding residue positions of zinc fingers, as referred to herein, are
numbered from the first residue in the α-helix of the finger, ranging from +1 to
+9. "-1" refers to the residue in the framework structure immediately preceding the
α-helix in a Cys2-His2 zinc finger polypeptide. Residues referred to as "++" are
residues present in an adjacent (C-terminal) finger. Where there is no C-terminal
adjacent finger, "++" interactions do not operate.

Cys2-His2 zinc finger binding proteins, as is well known in the art, bind to target
nucleic acid sequences via α-helical zinc metal atom co-ordinated binding motifs known
as zinc fingers. Each zinc finger in a zinc finger nucleic acid binding protein is
responsible for determining binding to a nucleic acid triplet in a nucleic acid binding
sequence. Preferably, there are 2 or more zinc fingers, for example 2, 3, 4, 5 or 6 zinc
fingers, in each binding protein. Advantageously, there are 3 zinc fingers in each zinc
finger binding protein.

The method of the present invention allows the production of what are essentially
artificial nucleic acid binding proteins. In these proteins, artificial analogues of amino
acids may be used, to impart the proteins with desired properties or for other reasons.
Thus, the term "amino acid", particularly in the context where "any amino acid" is
referred to, means any sort of natural or artificial amino acid or amino acid analogue
that may be employed in protein construction according to methods known in the art.
Moreover, any specific amino acid referred to herein may be replaced by a functional
analogue thereof, particularly an artificial functional analogue. The nomenclature used
herein therefore specifically comprises within its scope functional analogues of the
defined amino acids.

The α-helix of a zinc finger binding protein aligns antiparallel to the nucleic acid
strand, such that the primary nucleic acid sequence is arranged 3' to 5' in order to
correspond with the N-terminal to C-terminal sequence of the zinc finger. Since
nucleic acid sequences are conventionally written 5' to 3', and amino acid sequences N-terminus
to C-terminus, the result is that when a nucleic acid sequence and a zinc finger
protein are aligned according to convention, the primary interaction of the zinc finger is
with the - strand of the nucleic acid, since it is this strand which is aligned 3' to 5'.
These conventions are followed in the nomenclature used herein. It should be noted,
[...] + strand of nucleic acid: see Suzuki et al. (1994) NAR 22:3397-3405 and Pavletich
and Pabo, (1993) Science 261:1701-1707. The incorporation of such fingers into
nucleic acid binding molecules according to the invention is envisaged.
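
The antiparallel arrangement described above can be sketched as a simple assignment of triplets to fingers. This is an illustration only: it assumes the target is supplied 5' to 3' for the strand to which the recognition rules are applied, and the function name assign_fingers is hypothetical. Because the N-terminal to C-terminal order of the fingers corresponds to the 3' to 5' order of the nucleic acid, finger 1 (the N-terminal finger) receives the 3'-most triplet.

```python
def assign_fingers(target_5to3):
    """Map finger number (1 = N-terminal) to the triplet it reads.

    The target is written 5' to 3'. Fingers run N- to C-terminus along the
    strand read 3' to 5', so the triplet order is reversed.
    """
    target = target_5to3.upper()
    if len(target) == 0 or len(target) % 3 != 0:
        raise ValueError("target length must be a non-zero multiple of three")
    triplets = [target[i:i + 3] for i in range(0, len(target), 3)]
    return {number: triplet
            for number, triplet in enumerate(reversed(triplets), start=1)}
```

With a nine-base target such as AAATTTCCC, the N-terminal finger is assigned CCC and the C-terminal finger AAA.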

A zinc finger binding motif is a structure well-known to those in the art and defined in,
for example, Miller et al., (1985) EMBO J. 4:1609-1614; Berg (1988) PNAS (USA)
85:99-102; Lee et al., (1989) Science 245:635-637; see International patent applications
WO 96/06166 and WO 96/32475, corresponding to USSN 08/422,107, incorporated
herein by reference.

As used herein, "nucleic acid" refers to both RNA and DNA, constructed from natural
nucleic acid bases or synthetic bases, or mixtures thereof. Preferably, however, the
binding proteins of the invention are DNA binding proteins.

The structure of the framework of Cys2-His2 zinc fingers is known in the art. The 
present invention encompasses both those structures which have been observed in 
nature, including consensus structures derived therefrom, and artificial structures which 
have non-natural residue numbers and spacing but which retain the functionality of a 
zinc finger. 

In general, a preferred zinc finger framework has the structure: 



(A) X_{0-2} C X_{1-5} C X_{9-14} H X_{3-6} H/C



where X is any amino acid, and the numbers in subscript indicate the possible numbers 
of residues represented by X. 
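
As an informal illustration, framework (A) can be expressed as a regular expression over one-letter amino acid codes. The sketch assumes the subscripts of (A) read 0-2, 1-5, 9-14 and 3-6, and that X is restricted to upper-case letters; the names FRAMEWORK_A and has_framework are hypothetical.

```python
import re

# Framework (A): X(0-2) C X(1-5) C X(9-14) H X(3-6) H/C, with X any residue.
FRAMEWORK_A = re.compile(r"[A-Z]{0,2}C[A-Z]{1,5}C[A-Z]{9,14}H[A-Z]{3,6}[HC]")


def has_framework(sequence):
    """Return True if the sequence contains a stretch matching framework (A)."""
    return FRAMEWORK_A.search(sequence.upper()) is not None
```

A sequence such as PYKCSECGKAFSQKSNLTRHQRIHTGEKP contains a stretch that fits the pattern, whereas a short peptide lacking the Cys and His residues does not.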

In a preferred aspect of the present invention, zinc finger nucleic acid binding motifs
may be represented as motifs having the following primary structure:

(B) X^a C X_{2-4} C X_{2-3} F X^c X X X X L X X H X X X^b H - linker
                                 -1 1 2 3 4 5 6 7 8 9

wherein X (including X^a, X^b and X^c) is any amino acid. X_{2-4} and X_{2-3} refer to the
presence of 2 or 4, or 2 or 3, amino acids, respectively. The Cys and His residues,
which together co-ordinate the zinc metal atom, are marked in bold text and are usually
invariant, as is the Leu residue at position +4 in the α-helix.
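
The position numbering of structure (B) can be made concrete with a short sketch that pulls the ten helix residues (-1 and +1 to +9) out of a finger sequence. It assumes structure (B) reads X^a C X(2 or 4) C X(2 or 3) F X^c followed by the ten helix residues, then X^b H, with Leu at +4 and His at +7; the names STRUCTURE_B, POSITIONS and helix_positions are hypothetical.

```python
import re

# Structure (B), from the first Cys to the second His. The named group
# captures the ten residues at positions -1, +1 ... +9 of the helix.
STRUCTURE_B = re.compile(
    r"C(?:[A-Z]{4}|[A-Z]{2})C"                # C X(2 or 4) C
    r"[A-Z]{2,3}F[A-Z]"                       # X(2 or 3), Phe, X^c
    r"(?P<helix>[A-Z]{4}L[A-Z]{2}H[A-Z]{2})"  # -1..+3, Leu(+4), +5, +6, His(+7), +8, +9
    r"[A-Z]H"                                 # X^b and the second His
)
POSITIONS = (-1, 1, 2, 3, 4, 5, 6, 7, 8, 9)


def helix_positions(finger):
    """Return a dict mapping helix position to residue for one finger."""
    match = STRUCTURE_B.search(finger.upper())
    if match is None:
        raise ValueError("sequence does not fit structure (B)")
    return dict(zip(POSITIONS, match.group("helix")))
```

Applied to a finger of the form PYKCSECGKAFSQKSNLTRHQRIHTGEKP, the sketch reports Gln at -1, Asn at +3 and Arg at +6.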

Modifications to this representation may occur or be effected without necessarily
abolishing zinc finger function, by insertion, mutation or deletion of amino acids. For
example it is known that the second His residue may be replaced by Cys (Krizek et al.,
(1991) J. Am. Chem. Soc. 113:4518-4523) and that Leu at +4 can in some
circumstances be replaced with Arg. The Phe residue before X^c may be replaced by
any aromatic other than Trp. Moreover, experiments have shown that departure from
the preferred structure and residue assignments for the zinc finger are tolerated and may
even prove beneficial in binding to certain nucleic acid sequences. Even taking this
into account, however, the general structure involving an α-helix co-ordinated by a zinc
atom which contacts four Cys or His residues, does not alter. As used herein,
structures (A) and (B) above are taken as an exemplary structure representing all zinc
finger structures of the Cys2-His2 type.

Preferably, X^a is F/Y-X or P-F/Y-X. In this context, X is any amino acid. Preferably,
in this context X is E, K, T or S. Less preferred but also envisaged are Q, V, A and P.
The remaining amino acids remain possible.

Preferably, X_{2-4} consists of two amino acids rather than four. The first of these amino
acids may be any amino acid, but S, E, K, T, P and R are preferred. Advantageously,
it is P or R. The second of these amino acids is preferably E, although any amino acid
may be used.

Preferably, X^c is S or T.

Preferably, X_{2-3} is G-K-A, G-K-C, G-K-S or G-K-G. However, departures from the
preferred residues are possible, for example in the form of M-R-N or M-R.

Preferably, the linker is T-G-E-K or T-G-E-K-P.

As set out above, the major binding interactions occur with amino acids -1, +3 and
+6. Amino acids +4 and +7 are largely invariant. The remaining amino acids may
be essentially any amino acids. Preferably, position +9 is occupied by Arg or Lys.
Advantageously, positions +1, +5 and +8 are not hydrophobic amino acids, that is to
say are not Phe, Trp or Tyr. Preferably, position +2 is any amino acid, and preferably
serine, save where its nature is dictated by its role as a ++2 amino acid for an N-terminal
zinc finger in the same nucleic acid binding molecule.

In a most preferred aspect, therefore, bringing together the above, the invention allows 
the definition of every residue in a zinc finger nucleic acid binding motif which will 
bind specifically to a given nucleic acid triplet. 

The code provided by the present invention is not entirely rigid; certain choices are
provided. For example, positions +1, +5 and +8 may have any amino acid
allocation, whilst other positions may have certain options: for example, the present
rules provide that, for binding to a central T residue, any one of Ala, Ser or Val may
be used at +3. In its broadest sense, therefore, the present invention provides a very
large number of proteins which are capable of binding to every defined target nucleic
acid triplet. As set forth below, these proteins may be selected for binding ability using
polysome display techniques.



Preferably, however, the number of possibilities may be significantly reduced before
[...] the residues Lys, Thr and Gln respectively as a default option. In the case of the other
choices, for example, the first-given option may be employed as a default. Thus, the
code according to the present invention allows the design of a single, defined
polypeptide (a "default" polypeptide) which will bind to its target triplet.

In a further aspect of the present invention, there is provided a method for preparing a 
nucleic acid binding protein of the Cys2-His2 zinc finger class capable of binding to a 
target nucleic acid sequence, comprising the steps of: 

a) selecting a model zinc finger domain from the group consisting of naturally
occurring zinc fingers and consensus zinc fingers;

b) varying at least one of positions -1, +3, +6 (and ++2) of the finger as required
according to the rules set forth above; and

c) selecting the variants which bind to the target nucleic acid by polysome display.
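
Step b) can be pictured as enumerating the variants obtained when chosen helix positions are allowed to vary. The sketch below is illustrative only and is not the library construction used in the invention: it assumes the helix is given as a ten-letter string covering positions -1 and +1 to +9, varies each chosen position over the twenty natural amino acids, and leaves ++2 (which lies in the adjacent finger) out of scope. The names AMINO_ACIDS and vary_helix are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty natural residues, one-letter codes


def vary_helix(helix, positions=(-1, 3, 6)):
    """Yield every variant of a ten-residue helix varied at the given positions.

    The helix string holds position -1 at index 0 and positions +1..+9 at
    indices 1..9.
    """
    if len(helix) != 10:
        raise ValueError("helix must hold positions -1 and +1 to +9")
    index_of = {-1: 0, **{p: p for p in range(1, 10)}}
    slots = [index_of[p] for p in positions]
    for combination in product(AMINO_ACIDS, repeat=len(slots)):
        residues = list(helix)
        for slot, residue in zip(slots, combination):
            residues[slot] = residue
        yield "".join(residues)
```

Varying the three positions -1, +3 and +6 gives 20 x 20 x 20 = 8000 helix variants per finger, which indicates the scale of library from which binders would then be selected.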

In general, naturally occurring zinc fingers may be selected from those fingers for
which the nucleic acid binding specificity is known. For example, these may be the
fingers for which a crystal structure has been resolved: namely Zif 268 (Elrod-Erickson
et al., (1996) Structure 4:1171-1180), GLI (Pavletich and Pabo, (1993) Science
261:1701-1707), Tramtrack (Fairall et al., (1993) Nature 366:483-487) and YY1
(Houbaviy et al., (1996) PNAS (USA) 93:13577-13582).

The naturally occurring zinc finger 2 in Zif 268 makes an excellent starting point from
which to engineer a zinc finger and is preferred.

Consensus zinc finger structures may be prepared by comparing the sequences of
known zinc fingers, irrespective of whether their binding domain is known. Preferably,
the consensus structure is selected from the group consisting of the consensus structure
PYKCPECGKSFSQKSDLVKHQRTHTG, and the consensus
structure PYKCSECGKAFSQKSNLTRHQRIHTGEKP.

The consensuses are derived from the consensus provided by Krizek et al., (1991) J.
Am. Chem. Soc. 113:4518-4523 and from Jacobs, (1993) PhD thesis, University of
Cambridge, UK. In both cases, the linker sequences described above for joining two
zinc finger motifs together, namely TGEK or TGEKP, can be formed on the ends of the
consensus. Thus, a P may be removed where necessary, or, in the case of the
consensus terminating TG, EK(P) can be added.



When the nucleic acid specificity of the model finger selected is known, the mutation of
the finger in order to modify its specificity to bind to the target nucleic acid may be
directed to residues known to affect binding to bases at which the natural and desired
targets differ. Otherwise, mutation of the model fingers should be concentrated upon
residues -1, +3, +6 and ++2 as provided for in the foregoing rules.

In order to produce a binding protein having improved binding, moreover, the rules
provided by the present invention may be supplemented by physical or virtual
modelling of the protein/nucleic acid interface in order to assist in residue selection.

Zinc finger binding motifs designed according to the invention may be combined into
nucleic acid binding proteins having a multiplicity of zinc fingers. Preferably, the
proteins have at least two zinc fingers. In nature, zinc finger binding proteins
commonly have at least three zinc fingers, although two-zinc finger proteins such as
Tramtrack are known. The presence of at least three zinc fingers is preferred. Binding
proteins may be constructed by joining the required fingers end to end, N-terminus to
C-terminus. Preferably, this is effected by joining together the relevant nucleic acid
coding sequences encoding the zinc fingers to produce a composite coding sequence
encoding the entire binding protein. The invention therefore provides a method for
producing a nucleic acid binding protein as defined above, wherein the nucleic acid
binding protein is constructed by recombinant DNA technology, the method comprising
the steps of:

a) preparing a nucleic acid coding sequence encoding two or more zinc finger binding
motifs as defined above, placed N-terminus to C-terminus;

b) inserting the nucleic acid sequence into a suitable expression vector; and 

c) expressing the nucleic acid sequence in a host organism in order to obtain the nucleic 
acid binding protein. 

A "leader" peptide may be added to the N-terminal finger. Preferably, the leader
peptide is MAEEKP.
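
The end-to-end joining of fingers can be sketched at the amino acid level as follows. This is a simplified illustration, not the recombinant procedure itself: it assumes each finger is supplied as a core sequence without its own linker, joins the cores N-terminus to C-terminus with the TGEKP linker, and prefixes the MAEEKP leader. The function name assemble_protein and the example core are hypothetical.

```python
LEADER = "MAEEKP"   # preferred leader peptide
LINKER = "TGEKP"    # preferred linker between adjacent fingers


def assemble_protein(finger_cores, leader=LEADER, linker=LINKER):
    """Join finger cores N- to C-terminus with a linker, after a leader peptide.

    At least two fingers are required, in line with the stated preference.
    """
    if len(finger_cores) < 2:
        raise ValueError("at least two zinc fingers are required")
    return leader + linker.join(finger_cores)
```

In practice the same joining would be carried out on the coding sequences, as described in steps a) to c), and a P may need to be removed or added at a junction where a finger sequence already carries one.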

The nucleic acid encoding the nucleic acid binding protein according to the invention 
can be incorporated into vectors for further manipulation. As used herein, vector (or 
plasmid) refers to discrete elements that are used to introduce heterologous nucleic acid 
into cells for either expression or replication thereof. Selection and use of such vehicles 
are well within the skill of the person of ordinary skill in the art. Many vectors are
available, and selection of an appropriate vector will depend on the intended use of the
vector, i.e. whether it is to be used for DNA amplification or for nucleic acid 
expression, the size of the DNA to be inserted into the vector, and the host cell to be 
transformed with the vector. Each vector contains various components depending on its 
function (amplification of DNA or expression of DNA) and the host cell for which it is 
compatible. The vector components generally include, but are not limited to, one or 
more of the following: an origin of replication, one or more marker genes, an enhancer 
element, a promoter, a transcription termination sequence and a signal sequence. 

Both expression and cloning vectors generally contain nucleic acid sequences that enable
the vector to replicate in one or more selected host cells. Typically in cloning vectors,
this sequence is one that enables the vector to replicate independently of the host
chromosomal DNA, and includes origins of replication or autonomously replicating
[...] The origin of replication from the plasmid pBR322 is suitable for most Gram-negative
bacteria, the 2μ plasmid origin is suitable for yeast, and various viral origins (e.g. SV40,
polyoma, adenovirus) are useful for cloning vectors in mammalian cells. Generally,
the origin of replication component is not needed for mammalian expression vectors
unless these are used in mammalian cells competent for high level DNA replication,
such as COS cells.

Most expression vectors are shuttle vectors, i.e. they are capable of replication in at 
least one class of organisms but can be transfected into another organism for
expression. For example, a vector is cloned in E. coli and then the same vector is 
transfected into yeast or mammalian cells even though it is not capable of replicating 
independently of the host cell chromosome. DNA may also be replicated by insertion 
into the host genome. However, the recovery of genomic DNA encoding the nucleic 
acid binding protein is more complex than that of exogenously replicated vector because 
restriction enzyme digestion is required to excise nucleic acid binding protein DNA. 
DNA can be amplified by PCR and be directly transfected into the host cells without 
any replication component. 

Advantageously, an expression and cloning vector may contain a selection gene, also
referred to as a selectable marker. This gene encodes a protein necessary for the survival
or growth of transformed host cells grown in a selective culture medium. Host cells not
transformed with the vector containing the selection gene will not survive in the culture
medium. Typical selection genes encode proteins that confer resistance to antibiotics
and other toxins, e.g. ampicillin, neomycin, methotrexate or tetracycline, complement
auxotrophic deficiencies, or supply critical nutrients not available from complex media.

As to a selective gene marker appropriate for yeast, any marker gene can be used which
facilitates the selection for transformants due to the phenotypic expression of the
marker gene. Suitable markers for yeast are, for example, those conferring resistance to
antibiotics G418, hygromycin or bleomycin, or provide for prototrophy in an
[...]

Since the replication of vectors is conveniently done in E. coli, an E. coli genetic
marker and an E. coli origin of replication are advantageously included. These can be
obtained from E. coli plasmids, such as pBR322, Bluescript vector or a pUC plasmid,
e.g. pUC18 or pUC19, which contain both E. coli replication origin and E. coli genetic
marker conferring resistance to antibiotics, such as ampicillin.

Suitable selectable markers for mammalian cells are those that enable the identification
of cells competent to take up nucleic acid binding protein nucleic acid, such as
dihydrofolate reductase (DHFR, methotrexate resistance), thymidine kinase, or genes
conferring resistance to G418 or hygromycin. The mammalian cell transformants are
placed under selection pressure which only those transformants which have taken up
and are expressing the marker are uniquely adapted to survive. In the case of a DHFR
or glutamine synthase (GS) marker, selection pressure can be imposed by culturing the
transformants under conditions in which the pressure is progressively increased,
thereby leading to amplification (at its chromosomal integration site) of both the 
selection gene and the linked DNA that encodes the nucleic acid binding protein. 
Amplification is the process by which genes in greater demand for the production of a 
protein critical for growth, together with closely associated genes which may encode a 
desired protein, are reiterated in tandem within the chromosomes of recombinant cells. 
Increased quantities of desired protein are usually synthesised from thus amplified 
DNA. 

Expression and cloning vectors usually contain a promoter that is recognised by the
host organism and is operably linked to nucleic acid binding protein encoding nucleic
acid. Such a promoter may be inducible or constitutive. The promoters are operably
linked to DNA encoding the nucleic acid binding protein by removing the promoter
from the source DNA by restriction enzyme digestion and inserting the isolated
promoter sequence into the vector. Both the native nucleic acid binding protein
promoter sequence and many heterologous promoters may be used to direct
[...]

Promoters suitable for use with prokaryotic hosts include, for example, the β-lactamase
and lactose promoter systems, alkaline phosphatase, the tryptophan (trp) promoter
system and hybrid promoters such as the tac promoter. Their nucleotide sequences have
been published, thereby enabling the skilled worker operably to ligate them to DNA
encoding nucleic acid binding protein, using linkers or adapters to supply any required
restriction sites. Promoters for use in bacterial systems will also generally contain a
Shine-Dalgarno sequence operably linked to the DNA encoding the nucleic acid binding
protein.

Preferred expression vectors are bacterial expression vectors which comprise a
promoter of a bacteriophage such as phagex or T7 which is capable of functioning in
the bacteria. In one of the most widely used expression systems, the nucleic acid
encoding the fusion protein may be transcribed from the vector by T7 RNA polymerase
(Studier et al., Methods in Enzymol. 185: 60-89, 1990). In the E. coli BL21(DE3)
host strain, used in conjunction with pET vectors, the T7 RNA polymerase is produced
from the λ-lysogen DE3 in the host bacterium, and its expression is under the control
of the IPTG-inducible lac UV5 promoter. This system has been employed successfully
for over-production of many proteins. Alternatively the polymerase gene may be
introduced on a lambda phage by infection with an int- phage such as the CE6 phage
which is commercially available (Novagen, Madison, USA). Other vectors include
vectors containing the lambda PL promoter such as pLEX (Invitrogen, NL), vectors
containing the trc promoters such as pTrcHisXpress™ (Invitrogen) or pTrc99
(Pharmacia Biotech, SE) or vectors containing the tac promoter such as pKK223-3
(Pharmacia Biotech) or pMAL (New England Biolabs, MA, USA).

Moreover, the nucleic acid binding protein gene according to the invention preferably 
includes a secretion sequence in order to facilitate secretion of the polypeptide from 
bacterial hosts, such that it will be produced as a soluble native peptide rather than in 
an inclusion body. The peptide may be recovered from the bacterial periplasmic space,

Suitable promoting sequences for use with yeast hosts may be regulated or constitutive
and are preferably derived from a highly expressed yeast gene, especially a
Saccharomyces cerevisiae gene. Thus, the promoter of the TRP1 gene, the ADHI or
ADHII gene, the acid phosphatase (PHO5) gene, a promoter of the yeast mating
pheromone genes coding for the a- or α-factor or a promoter derived from a gene
encoding a glycolytic enzyme such as the promoter of the enolase,
glyceraldehyde-3-phosphate dehydrogenase (GAP), 3-phosphoglycerate kinase (PGK),
hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase, triose phosphate isomerase, phosphoglucose
isomerase or glucokinase genes, or a promoter from the TATA binding protein (TBP)
gene can be used. Furthermore, it is possible to use hybrid promoters comprising
upstream activation sequences (UAS) of one yeast gene and downstream promoter
elements including a functional TATA box of another yeast gene, for example a hybrid
promoter including the UAS(s) of the yeast PHO5 gene and downstream promoter
elements including a functional TATA box of the yeast GAP gene (PHO5-GAP hybrid
promoter). A suitable constitutive PHO5 promoter is e.g. a shortened acid phosphatase
PHO5 promoter devoid of the upstream regulatory elements (UAS) such as the PHO5
(-173) promoter element starting at nucleotide -173 and ending at nucleotide -9 of the
PHO5 gene.

Nucleic acid binding protein gene transcription from vectors in mammalian hosts may
be controlled by promoters derived from the genomes of viruses such as polyoma virus,
adenovirus, fowlpox virus, bovine papilloma virus, avian sarcoma virus,
cytomegalovirus (CMV), a retrovirus and Simian Virus 40 (SV40), from heterologous
mammalian promoters such as the actin promoter or a very strong promoter, e.g. a
ribosomal protein promoter, and from the promoter normally associated with nucleic
acid binding protein sequence, provided such promoters are compatible with the host
cell systems.

Transcription of a DNA encoding nucleic acid binding protein by higher eukaryotes
may be increased by inserting an enhancer sequence into the vector. Enhancers are
relatively orientation and position independent. Many enhancer sequences are known
from mammalian genes (e.g. elastase and globin). However, typically one will employ
an enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the
late side of the replication origin (bp 100-270) and the CMV early promoter enhancer.
The enhancer may be spliced into the vector at a position 5' or 3' to nucleic acid
binding protein DNA, but is preferably located at a site 5' from the promoter.

Advantageously, a eukaryotic expression vector encoding a nucleic acid binding protein
according to the invention may comprise a locus control region (LCR). LCRs are
capable of directing high-level integration site independent expression of transgenes
integrated into host cell chromatin, which is of importance especially where the nucleic
acid binding protein gene is to be expressed in the context of a permanently-transfected 
eukaryotic cell line in which chromosomal integration of the vector has occurred, or in 
transgenic animals. 

Eukaryotic vectors may also contain sequences necessary for the termination of 
transcription and for stabilising the mRNA. Such sequences are commonly available 
from the 5' and 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These
regions contain nucleotide segments transcribed as polyadenylated fragments in the 
untranslated portion of the mRNA encoding nucleic acid binding protein. 

An expression vector includes any vector capable of expressing nucleic acid binding 
protein nucleic acids that are operatively linked with regulatory sequences, such as 
promoter regions, that are capable of expression of such DNAs. Thus, an expression 
vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage,
recombinant virus or other vector, that upon introduction into an appropriate host cell,
results in expression of the cloned DNA. Appropriate expression vectors are well
known to those with ordinary skill in the art and include those that are replicable in

integrate into the host cell genome. For example, DNAs encoding nucleic acid binding
protein may be inserted into a vector suitable for expression of cDNAs in mammalian
cells, e.g. a CMV enhancer-based vector such as pEVRF (Matthias, et al., (1989) NAR
17, 6418).

Particularly useful for practising the present invention are expression vectors that 
provide for the transient expression of DNA encoding nucleic acid binding protein in 
mammalian cells. Transient expression usually involves the use of an expression vector 
that is able to replicate efficiently in a host cell, such that the host cell accumulates
many copies of the expression vector, and, in turn, synthesises high levels of nucleic
acid binding protein. For the purposes of the present invention, transient expression
systems are useful e.g. for identifying nucleic acid binding protein mutants, to identify
potential phosphorylation sites, or to characterise functional domains of the protein.

Construction of vectors according to the invention employs conventional ligation
techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in
the form desired to generate the plasmids required. If desired, analysis to confirm
correct sequences in the constructed plasmids is performed in a known fashion. Suitable
methods for constructing expression vectors, preparing in vitro transcripts, introducing
DNA into host cells, and performing analyses for assessing nucleic acid binding protein
expression and function are known to those skilled in the art. Gene presence,
amplification and/or expression may be measured in a sample directly, for example, by
conventional Southern blotting, Northern blotting to quantitate the transcription of
mRNA, dot blotting (DNA or RNA analysis), or in situ hybridisation, using an
appropriately labelled probe which may be based on a sequence provided herein. Those
skilled in the art will readily envisage how these methods may be modified, if desired.

In accordance with another embodiment of the present invention, there are provided 
cells containing the above-described nucleic acids. Host cells such as prokaryote,
yeast and higher eukaryote cells may be used for replicating DNA and producing the

negative or Gram-positive organisms, such as E. coli, e.g. E. coli K-12 strains, DH5α
and HB101, or Bacilli. Further hosts suitable for the nucleic acid binding protein
encoding vectors include eukaryotic microbes such as filamentous fungi or yeast, e.g.
Saccharomyces cerevisiae. Higher eukaryotic cells include insect and vertebrate cells,
particularly mammalian cells including human cells, or nucleated cells from other
multicellular organisms. In recent years propagation of vertebrate cells in culture (tissue
culture) has become a routine procedure. Examples of useful mammalian host cell lines
are epithelial or fibroblastic cell lines such as Chinese hamster ovary (CHO) cells, NIH
3T3 cells, HeLa cells or 293T cells. The host cells referred to in this disclosure
comprise cells in in vitro culture as well as cells that are within a host animal.



DNA may be stably incorporated into cells or may be transiently expressed using 
methods known in the art. Stably transfected mammalian cells may be prepared by 
transfecting cells with an expression vector having a selectable marker gene, and 
growing the transfected cells under conditions selective for cells expressing the marker
gene. To prepare transient transfectants, mammalian cells are transfected with a
reporter gene to monitor transfection efficiency. 

To produce such stably or transiently transfected cells, the cells should be transfected 
with a sufficient amount of the nucleic acid binding protein-encoding nucleic acid to
form the nucleic acid binding protein. The precise amounts of DNA encoding the
nucleic acid binding protein may be empirically determined and optimised for a
particular cell and assay.

Host cells are transfected or, preferably, transformed with the above-captioned
expression or cloning vectors of this invention and cultured in conventional nutrient
media modified as appropriate for inducing promoters, selecting transformants, or
amplifying the genes encoding the desired sequences. Heterologous DNA may be
introduced into host cells by any method known in the art, such as transfection with a
vector encoding a heterologous DNA by the calcium phosphate coprecipitation

skilled worker in the field. Successful transfection is generally recognised when any
indication of the operation of this vector occurs in the host cell. Transformation is
achieved using standard techniques appropriate to the particular host cells used.

Incorporation of cloned DNA into a suitable expression vector, transfection of
eukaryotic cells with a plasmid vector or a combination of plasmid vectors, each
encoding one or more distinct genes or with linear DNA, and selection of transfected
cells are well known in the art (see, e.g. Sambrook et al. (1989) Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press).

Transfected or transformed cells are cultured using media and culturing methods known 
in the art, preferably under conditions, whereby the nucleic acid binding protein 
encoded by the DNA is expressed. The composition of suitable media is known to those
in the art, so that they can be readily prepared. Suitable culturing media are also 
commercially available. 

The zinc finger polypeptides according to the present invention are varied at at least one 
position. Preferably, the position selected for variation is one of the positions
identified above as being important in determining the binding specificity of the zinc 
finger polypeptide of the invention. 

By "vary" (including grammatical modifications) it is intended to denote that a 
particular amino acid in the molecule is replaced with an amino acid selected from a 
varied group, to produce a repertoire of homologous zinc finger polypeptides which 
differ at the particular amino acid position. The variant amino acids may be selected 
from a small group of two or more amino acids, from a larger group or may be 
completely randomly selected from all 20 naturally occurring amino acids. In a 
preferred embodiment, amino acid analogues and artificial amino acids may be
employed.

Variation of the zinc finger binding motifs produced according to the invention is
preferably directed to those residues where the code provided herein gives a choice of
residues. For example, therefore, positions +1, +5 and +8 are advantageously
randomised, whilst preferably avoiding hydrophobic amino acids; positions involved in
binding to the nucleic acid, notably -1, +2, +3 and +6, may be randomised also,
preferably within the choices provided by the rules of the present invention.
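
By way of illustration only, the randomisation scheme described above may be sketched in Python. The position labels are taken from the text; the set of residues treated as hydrophobic and the restricted choice shown for position -1 are assumptions made for the example and are not the rules of the invention.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 naturally occurring amino acids
# Assumption: which residues count as hydrophobic is a modelling choice.
HYDROPHOBIC = set("AVLIMFWC")


def allowed_residues(avoid_hydrophobic):
    """Return the residues permitted at a randomised position."""
    return [aa for aa in AMINO_ACIDS
            if not (avoid_hydrophobic and aa in HYDROPHOBIC)]


def build_repertoire(choices):
    """Yield every variant as a dict mapping position label to residue."""
    positions = list(choices)
    for combo in product(*(choices[p] for p in positions)):
        yield dict(zip(positions, combo))


def repertoire_size(choices):
    """Number of distinct variants: the product of the choices per position."""
    size = 1
    for residues in choices.values():
        size *= len(residues)
    return size


# Positions +1, +5 and +8 randomised while avoiding hydrophobic residues.
choices = {p: allowed_residues(avoid_hydrophobic=True)
           for p in ("+1", "+5", "+8")}
# Hypothetical restricted choice at one base-contacting position.
choices["-1"] = ["R", "Q", "N"]

print(repertoire_size(choices))
```

The size of the repertoire grows as the product of the number of residues permitted at each varied position, which is why restricting the choices at each position keeps the library within a range that can be screened.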

Preferably, therefore, the "default" protein produced according to the rules provided by 
the invention can be improved by subjecting the protein to one or more rounds of 
variation and selection within the specified parameters. 

Mutagenesis of zinc finger polypeptides may be achieved by any suitable means. 
Preferably, the mutagenesis is performed at the nucleic acid level, for example by 
synthesising novel genes encoding mutant proteins and expressing these to obtain a 
variety of different proteins. Alternatively, existing genes can be themselves mutated, 
such as by site-directed or random mutagenesis, in order to obtain the desired mutant
genes.

Mutations may be performed by any method known to those of skill in the art. 
Preferred, however, is site-directed mutagenesis of a nucleic acid sequence encoding 
the protein of interest. A number of methods for site-directed mutagenesis are known 
in the art, from methods employing single-stranded phage such as M13 to PCR-based
techniques (see "PCR Protocols: A guide to methods and applications", M.A. Innis,
D.H. Gelfand, J.J. Sninsky, T.J. White (eds.), Academic Press, New York, 1990).
Preferably, the commercially available Altered Site II Mutagenesis System (Promega) 
may be employed, according to the directions given by the manufacturer. 

Selection of varied polypeptides according to the invention is carried out by polysome 
display (see Table 1). This technique relies on coupled transcription and translation of 
the coding sequences encoding the zinc finger polypeptides of the invention. This is

from the ribosome, such that the whole entity can be isolated as a polysome.
Polysomes are then selected by binding the polypeptide to target nucleic acid, and
mRNA eluted from those polysomes which display the desired binding characteristics.



The Polysome Display Procedure

Polysome display may be performed according to the methods known in the art, as
described below. For example, reference is made to WO95/11922, the methods of
which are incorporated herein by reference. The methods of WO95/11922 may be
adapted to the present invention, as follows:

Improved Methods For Screening Nascent Peptide Libraries 

A polysome library displaying nascent zinc finger polypeptides can be generated
by a variety of methods. Generally, an in vitro translation system is employed to
generate polysomes from a population of added mRNA species. Often, the in vitro
translation system used is a conventional eukaryotic translation system (e.g., rabbit
reticulocyte lysate, wheat germ extract). However, an E. coli S30 system (Promega,
Madison, Wisconsin) can be used to generate the polysome library from a population of
added mRNA species or by coupled transcription/translation (infra). Suitable E. coli
S30 systems may be produced by conventional methods or may be obtained from
commercial sources (Promega, Madison, Wisconsin). The E. coli S30 translation
system is generally more efficient at producing polysomes suitable for affinity screening
of displayed nascent peptides, and the like. Moreover, a prokaryotic translation
system, such as the E. coli S30 system, has the further advantage that a variety of drugs
which block prokaryotic translation (e.g., inhibitors of ribosome function), such as
rifampicin or chloramphenicol, can be added at a suitable concentration and/or
timepoint to stall translation and produce a population of stalled polysomes, suitable for
affinity screening against a target nucleic acid sequence.

In general, the method comprises the steps of: (1) introducing a population of mRNA 
species into a prokaryotic in vitro translation system (e.g., E. coli S30) under 
conditions suitable for translation to form a pool of polysomes displaying nascent zinc 
finger polypeptides, so-called polysome forming conditions; (2) contacting the 
polysomes with a target nucleic acid under suitable binding conditions (i.e., for specific 
binding to the target nucleic acid and for preserving intact polysome structure); (3) 
selecting polysomes which are specifically bound to the nucleic acid (e.g., by removing

sequences of the selected polysomes (e.g., by synthesizing cDNA or reverse
transcriptase PCR amplification product, and sequencing said cDNA or amplification
product). Often, the nucleic acid used for screening is immobilized, such as by being
bound to a solid support.

In a variation of the method, the population of mRNA molecules is introduced into the 
in vitro translation system by de novo synthesis of the mRNA from a DNA template. 
In this improvement, a population of DNA templates capable of being transcribed in 
vitro (e.g., having an operably linked T7 or SP6 or other suitable promoter) are
introduced into a coupled in vitro transcription/translation system (e.g., an E. coli S30
system) under conditions suitable for in vitro transcription and translation of the
transcribed product. Generally, using a coupled in vitro transcription/translation
system is highly efficient for producing polysomes displaying nascent zinc finger
polypeptides suitable for affinity screening. Of course, and as noted above, uncoupled
systems may also be used, i.e., by adding mRNA to an in vitro translation extract.

A further improvement to the general methods of screening nascent zinc finger 
polypeptide-displaying polysomes comprises the additional step of adding a preblocking 
agent (e.g., nonfat milk, serum albumin, tRNA, and/or gelatin) prior to or concomitant
with the step of contacting the nascent peptide-displaying polysomes with an
immobilized nucleic acid. The additional step of adding a preblocking agent reduces
the amount of polysomes which bind nonspecifically to the target nucleic acid and/or to
the immobilization surface (e.g., microtitre well), thereby enhancing the specificity of
selection for polysomes displaying peptides that specifically bind to the nucleic acid.
Although the preblocking agent can be selected from a broad group of suitable
compositions, the group of preblocking agents comprising: nonfat milk/nonfat milk
solids, casein, bovine serum albumin, transfer RNA, and gelatin are preferred, with
nonfat milk being especially preferable. Other suitable preblocking agents can be used.

Preblocking agents that do not substantially interfere with specific binding (i.e., non-

A further improvement to the general methods of screening nascent peptide-displaying
polysomes comprises the additional step of isolating polysomes from an in vitro
translation reaction (or a coupled in vitro transcription/translation reaction) prior to the
step of contacting the nascent peptide-displaying polysomes with nucleic acid.
Generally, the polysomes are isolated from a translation reaction by high speed
centrifugation to pellet the polysomes, so that the polysome pellet is recovered and the
supernatant containing contaminants is discarded. The polysome pellet is resolubilised
in a suitable solution to retain intact polysomes. The resolubilised polysomes may be
recentrifuged at lower speed (i.e., which does not pellet polysomes) so that the
insoluble contaminants pellet and are discarded and the supernatant containing soluble
polysomes is recovered, and the supernatant used for affinity screening. Alternatively,
the resolubilised polysomes may be used for affinity screening directly (i.e., without
low speed centrifugation). Furthermore, the order of centrifugation may be reversed,
so that low speed centrifugation is performed prior to high speed centrifugation; the low
speed centrifugation supernatant is then centrifuged at high speed and the pelleted
polysomes are resolubilised and used for affinity screening. Multiple rounds of high
speed and/or low speed centrifugation may be used to increasingly purify the polysomes
prior to contacting the polysomes with the immobilized nucleic acid.



Another improvement to the general methods of affinity screening of nascent peptide-
displaying polysomes comprises adding a non-ionic detergent to the binding and/or
wash buffers. Non-ionic detergent (e.g., Triton X-100, NP-40, Tween, etc.) is added
in the binding buffer (i.e., the aqueous solution present during the step of contacting the
polysomes with the immobilized nucleic acid) and/or the wash buffer (i.e., the aqueous
solution used to wash the bound polysomes, i.e., those bound to the immobilized nucleic
acid). Generally, the non-ionic detergent is added to a final concentration of between
about 0.01 and 0.5% (v/v), with 0.1% being typical.
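
As a worked illustration of the concentration range stated above, the volume of a detergent stock needed for a given final concentration follows from C1 x V1 = C2 x V2. The Python sketch below assumes a hypothetical 10% (v/v) stock and a 50 ml buffer volume; neither figure comes from the present description.

```python
def stock_volume_for_final(final_percent, total_volume_ml, stock_percent):
    """Volume of stock (ml) giving final_percent (v/v) in total_volume_ml.

    Applies C1 * V1 = C2 * V2, solved for V1.
    """
    if not 0 < final_percent <= stock_percent:
        raise ValueError("final concentration must be positive and "
                         "must not exceed the stock concentration")
    return final_percent * total_volume_ml / stock_percent


def in_stated_range(final_percent, low=0.01, high=0.5):
    """True if the concentration lies within the 0.01 to 0.5% range."""
    return low <= final_percent <= high


# Hypothetical example: 0.1% final detergent in 50 ml, from a 10% stock.
volume_ml = stock_volume_for_final(0.1, 50.0, 10.0)
print(f"{volume_ml:.2f} ml of stock, in range: {in_stated_range(0.1)}")
```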



Another improvement to the general methods of affinity screening of nascent peptide

transcribed) in vitro without cloning the library in host cells. Cloning libraries in host
cells frequently diminishes the diversity of the library and may skew the distribution of
the relative abundance of library members. In vitro library construction generally
comprises ligating each member of a population of polynucleotides encoding library
members to a polynucleotide sequence comprising a promoter suitable for in vitro
transcription (e.g., T7 promoter and leader). The resultant population of DNA
templates may optionally be purified by gel electrophoresis. The population of DNA
templates is then transcribed and translated in vitro, such as by a coupled
transcription/translation system (e.g., E. coli S30).



A further improvement to the general methods of affinity screening comprises the added
step of combining affinity screening of a nascent peptide-displaying polysome library
with screening of a bacteriophage peptide display library (or other, i.e., peptides on
plasmids, expression as secreted soluble antibody in host cells, in vitro expression). In
this improvement, polysomes are isolated by affinity screening of a nascent peptide-
display library. The isolated polysomes are dissociated, and cDNA is made from the
mRNA sequences that encoded nascent peptides that specifically bound to the target
nucleic acid. The cDNA sequences encoding the nascent peptide binding regions (i.e.,
the portions which formed binding contacts to the nucleic acid(s); variable segment
sequences) are cloned into a suitable bacteriophage peptide display vector (e.g., pAFF6
or other suitable vector). The resultant bacteriophage vectors are introduced into a host
cell to produce a library of bacteriophage particles. Each of the phage clones expresses
on its virion surface the polysome derived peptide sequences as fusions to a coat
protein (e.g., as an N-terminal fusion to the pIII coat protein). By incorporating the in
vitro-enriched peptide sequences from the polysome screening into a bacteriophage
display system, it is possible to continue affinity selection for additional rounds. It is
also advantageous, because the resultant bacteriophage display libraries can be screened
and tested under conditions that might not have been appropriate for the intact
polysomes.

Another improvement to the methods of affinity screening is the control of display
valency (i.e., the average number of functional zinc finger polypeptides displayed per
polysome), and the capacity to vary display valency in different rounds of affinity
screening. Typically, a high display valency permits many binding contacts between
the polysome and nucleic acid, thus affording stable binding for polysomes which
encode zinc finger polypeptide species which have relatively weak binding. Hence, a
high display valency system allows screening to identify a broader diversity range of
zinc finger polypeptides, since even lower affinity zinc finger polypeptides can be
selected. Frequently, such low-to-medium affinity zinc finger polypeptides can be
superior candidates for generating very high affinity zinc finger polypeptides, by
selecting high affinity zinc finger polypeptides from a pool of mutagenised low-to-
medium affinity zinc finger polypeptide clones. Thus, affinity sharpening by
mutagenesis and subsequent rounds of affinity selection can be used in conjunction with
a broader pool of initially selected zinc finger polypeptide sequences if a high display
valency method is used. Alternate rounds of high display valency screening and low
display valency screening can be performed, in any order, starting from either a high or
low valency system, for as many affinity screening rounds as desired, with intervening
variation and sequence diversity broadening, if desired. Alternate rounds of affinity
screening, wherein a first round consists of screening a zinc finger polypeptide library
expressed in a high valency display system, selecting zinc finger polypeptide clones
which bind the target nucleic acid, optionally conducting a mutagenesis step to expand
the sequence variability of the selected zinc finger polypeptides, expressing the selected
zinc finger polypeptide clones in a lower valency display system, and selecting clones
which bind the target nucleic acid, can be performed, including various permutations
and combinations of multiple screening cycles, wherein each cycle can be of a similar
or different display valency. This improvement affords an overall screening program
that employs systems which are compatible with switchable valency (i.e., one screening
cycle can have a different display valency than the other(s), and can alternate in order).
Display valency can be controlled by a variety of methods, including but not limited to:

system. This can be controlled by any suitable method, including: (1) altering the
length of the encoding mRNA sequence to reduce or increase the frequency of
translation termination (a longer mRNA will typically display more nascent peptides per
polysome than a shorter mRNA encoding sequence), (2) incorporating stalling (i.e.,
infrequently used) codons in the encoding mRNA, typically distal to (downstream of)
the zinc finger polypeptide-encoding portion(s), (3) incorporating RNA secondary
structure-forming sequences (e.g., hairpin, cruciform, etc.) distal to the zinc finger
polypeptide-encoding portion and proximal to (upstream to) the translation termination
site, if any, and/or (4) including an antisense polynucleotide (e.g., DNA, RNA,
polyamide nucleic acid) that hybridizes to the mRNA distal to the zinc finger
polypeptide-encoding portion and proximal to (and possibly spanning) the translation
termination site, if any. The length of the mRNA may be increased to increase display
valency, such as by adding additional reading frame sequences downstream of the zinc
finger polypeptide-encoding sequences; such additional reading frame sequences can,
for example, encode the sequence (-AAVP-)n, where n is typically at least 1, frequently
at least 5 to 10, often at least 15 to 25, and may be at least 50-100, up to approximately
150 to 500 or more, although infrequently a longer stall sequence can be used. Stalling
codons (i.e., codons which are slowly translated relative to other codons in a given
translation system) can be determined empirically for any translation system, such as by
measuring translation efficiency of mRNA templates which differ only in the presence
or relative abundance of particular codons. For example, a set of clones can be
evaluated in the chosen translation system; each species of the set has a stalling
polypeptide sequence of 25 amino acids, but each stalling polypeptide sequence consists
of a repeating series of one codon, such that all translatable codons are represented in
the set. When translated under equivalent conditions, the zinc finger polypeptide
species which produce polysomes having the highest valency (e.g., as determined by
sedimentation rate, buoyancy, electron microscopic examination, and other diagnostic
methods) thereby identify stalling codons as the codon(s) in the stalling polypeptide
sequence

In one embodiment, a stalling polypeptide sequence is distal (3' to) the zinc finger
polypeptide-encoding sequence, and comprises -(Gly-Gly-Gly-Gly-Ser)4-A-A-V-P-, or
repeats thereof.
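
The empirical test set for stalling codons and the (-AAVP-)n stall sequence described above may be sketched as follows. This Python sketch is illustrative only: it assumes the standard genetic code, treats "all translatable codons" as the 61 sense codons, and uses hypothetical function names; it merely enumerates the test segments and does not model translation.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # stop codons of the standard genetic code


def sense_codons():
    """All 61 codons that encode an amino acid in the standard genetic code."""
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return [codon for codon in codons if codon not in STOP_CODONS]


def stalling_test_set(repeats=25):
    """Map each sense codon to an mRNA segment repeating it `repeats` times."""
    return {codon: codon * repeats for codon in sense_codons()}


def stall_peptide(n):
    """Amino acid sequence (AAVP)n in one-letter code."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "AAVP" * n


test_set = stalling_test_set()
print(len(test_set), len(test_set["GGU"]))
print(stall_peptide(5))
```

Each segment in the set would be placed downstream of the same zinc finger polypeptide-encoding sequence, so that the templates differ only in the repeated codon.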

Alternatively, or in combination with the noted variations, the valency of the target 
nucleic acid may be varied to control the average binding affinity of selected library 
members. The target nucleic acid can be bound to a surface or substrate at varying 
densities, such as by including a competitor target nucleic acid, by dilution, or by another
method known to those in the art. A high density (valency) of target nucleic acid can
be used to enrich for library members which have relatively low affinity, whereas a low
density (valency) can preferentially enrich for higher affinity library members.

Each of the improvements to the methods of affinity screening may be combined with 
other compatible improvements. For example, an in vitro transcription/translation 
system can be used in conjunction with a library of DNA templates synthesized in vitro 
(i.e. without cloning in a host cell). The resultant polysomes can be purified by one or 
more rounds of high-speed and/or low-speed centrifugation. The purified polysomes 
can be contacted with an immobilized nucleic acid that is preblocked (e.g., with nonfat 
milk), and a non-ionic detergent may also be present to further reduce nonspecific
binding. The selected polysomes may then be used as templates for synthesizing cDNA 
which is then cloned into a bacteriophage display vector, such that the variable 
segments of the nascent peptides are now displayed on bacteriophage. 



Amplification, Affinity Enrichment, And Screening

A basic method is described for synthesizing a nascent peptide-polysome library in
vitro, screening and enrichment of the library for species having desired specific
binding properties, and recovery of the nucleotide sequences that encode those peptides
having binding affinity for the target nucleic acid sufficient for selection by affinity
selection.

The library consists of a population of nascent zinc finger polypeptide library members
comprising nascent peptides. After selecting those nascent peptide library members that
bind to the nucleic acid with high affinity, the selected complexes are disrupted and the
mRNA is recovered and amplified to create DNA copies of the message. Typically
each copy comprises an operably linked in vitro transcription promoter (e.g., T7 or
SP6 promoter). The DNA copies are transcribed in vitro to produce mRNA, and the
process is repeated to enrich for zinc finger polypeptides that bind with sufficient
affinity.
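
The effect of repeating the cycle of selection and amplification may be illustrated with a simple numerical model. The Python sketch below is an assumption-laden illustration and not a description of the method itself: each library member is assigned a hypothetical fraction retained on the target per round, and amplification is modelled as renormalisation of the recovered pool.

```python
def select_round(frequencies, retention):
    """One round of selection followed by amplification.

    frequencies: {species: fraction of the library}
    retention:   {species: fraction retained on the target in one round}
    Returns renormalised frequencies of the recovered, amplified pool.
    """
    recovered = {s: f * retention[s] for s, f in frequencies.items()}
    total = sum(recovered.values())
    if total == 0:
        raise ValueError("nothing was recovered from the target")
    return {s: amount / total for s, amount in recovered.items()}


def enrich(frequencies, retention, rounds):
    """Return the library composition before and after each round."""
    history = [frequencies]
    for _ in range(rounds):
        frequencies = select_round(frequencies, retention)
        history.append(frequencies)
    return history


# Hypothetical library: a rare strong binder among weak and background members.
library = {"strong": 0.001, "weak": 0.099, "background": 0.9}
retention = {"strong": 0.5, "weak": 0.05, "background": 0.001}

for round_number, freqs in enumerate(enrich(library, retention, rounds=3)):
    print(round_number, round(freqs["strong"], 3))
```

Under these assumed values the strong binder, initially one part in a thousand, comes to dominate the pool within a few rounds, which is the behaviour the repeated cycle is intended to produce.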

The following general steps are frequently followed in the method: (1) generate a DNA
template which is suitable for in vitro synthesis of mRNA, (2) synthesize mRNA in
vitro by transcription of the DNA templates in a coupled transcription/translation
system, (3) bind the nascent peptide to a preferably immobilised target nucleic acid, (4) 
recover and amplify nascent peptide library members which bind the target nucleic acid 
and produce DNA templates from the selected library members competent for in vitro 
transcription. 

Each generated DNA template preferably contains a promoter (e.g., T7 or SP6) which 
is active in an in vitro transcription system. A DNA template generally comprises (1) a 
promoter which is functional for in vitro transcription and operably linked to (2) a 
polynucleotide sequence encoding an mRNA. Said encoded mRNA comprises (1) a 
polynucleotide sequence which encodes a polypeptide comprising a zinc finger 
polypeptide, (2) a polynucleotide sequence to which a DNA primer suitable for priming 
first-strand cDNA synthesis of the mRNA can bind, and (3) a ribosome-binding site and 
other elements necessary for in vitro translatability of the mRNA, and optionally, for 
mRNA stability and translatable secondary structure, if any. 

Following translation, polysome complexes are screened for high-affinity nucleic acid 
binding using standard procedures and as described herein. 

After selecting those nascent peptide/polynucleotide complexes that bind with sufficient 
affinity, the polysomes are isolated and ribosomes released by the addition of EDTA 
sufficient to chelate the Mg2+ present in the buffer. Ribosomes are removed by 
high-speed centrifugation, and the RNA component is released by phenol extraction, or by 
changing the ionic strength, temperature or pH of the binding buffer so as to denature 
the nascent peptide. A cDNA copy of the mRNA is made using reverse transcriptase, 
and the cDNA copy is amplified by the polymerase chain reaction (PCR). The 
amplified cDNA is added to the in vitro transcription system and the process is repeated 
to enrich for those peptides that bind with high affinity. 



The invention is further described, for the purposes of illustration only, in the following 
examples. 

Example 1 

Preparation Of A Varied Zinc Finger Polypeptide 

Zinc finger polypeptides incorporating variation at selected positions are 
constructed in accordance with the preceding instructions, or as described in any one of 
GB9710805.4, GB9710806.2, GB9710807.0, GB9710808.8, GB9710809.6, 
GB9710810.4, GB9710811.2, GB9710812.0 or EP95928576.8, which are incorporated 
herein by reference. 

Example 2 

The Template Construct 

The construct is similar to that of Mattheakis et al (1994) Proc Natl Acad Sci 
USA, 91, 9022-9026, but with some modifications to increase the efficiency of 
ribosome stalling. 

General Structure 

The general structure of a transcription template suitable for selection of zinc 
finger polypeptides according to the present invention is shown in Table 2. 






Table 2 

General Structure of Zinc Finger Transcription Unit 

T7 promoter | RBS | Zinc finger gene | Linker/Stalling Sequence 
                                       (Gly/Ser linker | Easily translated region | Rare codons) 

The unit contains a bacteriophage T7 RNA polymerase promoter, which drives a 
coding sequence encoding a zinc finger polypeptide. Appended to the coding sequence 
is a linker/stalling sequence region which comprises a flexible Gly/Ser linker, an easily 
translatable region and a stalling region which is composed of codons rare in E. coli. 
Rare codons hold up the translation process and effectively stall the ribosome on the 
template. 

Sequence Information 

T7 Promoter 

This is the standard bacteriophage T7 RNA polymerase promoter, having the 
sequence TA ATA CGA CTA ACT ATA GGG AGA. 

Ribosome Binding Site 

This is the bacteriophage T7 gene 10 ribosome binding site. (This and the T7 
promoter give high efficiency initiation of transcription and translation.) It has the 
sequence AAGGAG. 

Zinc Finger Gene 

A zinc finger coding sequence as shown in SEQ. ID. No. 1 is used. 

Linker/Stalling Sequence 

The first 3/4 of this sequence is virtually the same as that used by Mattheakis et 
al (1994). This is because it is the principles behind the design which are important, 
and not the sequence itself. 

First there is a 31 residue serine-glycine repeat. This serves as a flexible linker, when 
translated, which ensures that the expressed zinc finger construct has spatial separation 
and flexibility with respect to the stalled ribosome. 

Second there is a series of six Ala-Ala-Val-Pro repeats. This is a standard, relatively 
easily translated sequence and serves to ensure that the ribosome is stalled after (and 
not before) the entire flexible (Ser-Gly) linker has emerged from the ribosome. This is 
relatively important since approximately ten amino acids are covered by the ribosome at 
any one time. 

Third comes a short stretch of codons which contain a high proportion of rare codons 
with respect to E. coli usage, which slow the translational process and cause regular 
pauses. 

Fourth, there is added towards the end of the "third region" an additional ribosome 
stalling sequence. This has been discussed in Gu et al (1994) Proc Natl Acad Sci USA, 
91, 5612-5616 and Lovett & Rogers (1996) Microbiological Reviews, 60, 366-385. 

The sequence: 

M   V   K   T   D   K 
ATG GTT AAA ACA GAT AAA 

when translated, interacts with the peptidyl transferase site of the E. coli ribosome, 
causing translational pausing. In the presence of chloramphenicol, this paused state 
becomes a stalled state. 

The sequence is found at the beginning of the cat-86 gene in E. coli which gives 
resistance to this antibiotic. 

It has been found that this sequence increases the efficiency of ribosome stalling by 
between 10% and 20%, when compared to the exact sequence used by Mattheakis et al 
(1994). 

Finally, there is no translational STOP codon, so ribosomes will pause when they reach 
the end of the RNA transcript, before dissociating. 
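
Two of the design points above (the in-frame MVKTDK stalling sequence and the absence of a STOP codon) lend themselves to a quick check on any candidate construct. The following is a minimal sketch only; the function names and the toy input in the usage note are hypothetical, and the only sequence taken from the text is ATG GTT AAA ACA GAT AAA.

```python
STALL_MOTIF = "ATGGTTAAAACAGATAAA"   # ATG GTT AAA ACA GAT AAA (M V K T D K), from the text
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three standard stop codons


def split_codons(orf):
    """Split a coding sequence into in-frame codons, ignoring spaces and case."""
    seq = orf.upper().replace(" ", "")
    usable = len(seq) - len(seq) % 3          # drop any trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def check_stalling_region(orf):
    """Report whether the reading frame has a stop codon and an in-frame motif."""
    codons = split_codons(orf)
    seq = "".join(codons)
    motif_in_frame = any(
        seq.startswith(STALL_MOTIF, i) for i in range(0, len(seq), 3)
    )
    return {
        "has_in_frame_stop": any(c in STOP_CODONS for c in codons),
        "motif_in_frame": motif_in_frame,
    }
```

For instance, `check_stalling_region("GGT TCT ATG GTT AAA ACA GAT AAA")` (a made-up fragment) reports no in-frame stop and an in-frame motif, which is the combination the construct described above is meant to have.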

Example 3 

The Procedure 

The template used is produced by PCR and so is linear, double-stranded DNA 
of approximately 670bp. 

Transcription is carried out in a coupled transcription and translation system for linear 
DNA templates (the E. coli S30 extract system for linear DNA, Promega). 

At present, transcription/translation reactions are carried out in 50µl volumes, each 
primed with approximately one pmole of template (400-500ng, up to 10^12 DNA 
molecules). 
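
The relationship between the quoted template length, mass and number of molecules can be checked with a short calculation. This is a sketch under one stated assumption: an average mass of about 650 g/mol per base pair of double-stranded DNA, which is a common rule of thumb and not a figure given in this document.

```python
AVOGADRO = 6.022e23   # molecules per mole
AVG_BP_MASS = 650.0   # g/mol per base pair of dsDNA (assumed rule-of-thumb value)


def ng_per_pmol(length_bp, bp_mass=AVG_BP_MASS):
    """Mass in ng of one pmol of a double-stranded template of the given length."""
    # g/mol is numerically equal to pg/pmol; divide by 1000 to get ng/pmol.
    return length_bp * bp_mass / 1000.0


def molecules(pmol):
    """Number of molecules in the given amount (pmol)."""
    return pmol * 1e-12 * AVOGADRO


if __name__ == "__main__":
    print(f"1 pmol of a 670 bp template is about {ng_per_pmol(670):.1f} ng")
    print(f"1 pmol is about {molecules(1.0):.3g} molecules")
```

With the assumed per-base-pair mass, one pmol of a 670bp template falls inside the 400-500ng range quoted above, and the molecule count stays below the stated upper bound of 10^12.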




The extract system does not contain T7 RNA polymerase, so this is supplemented by 
adding T7 polymerase enzyme, and endogenous E. coli RNA polymerase is inhibited by 
adding rifampicin. 



Template (0.5 pmol/µl)     2 µl 
Rifampicin (50 ng/µl)      1             inhibits E. coli RNA polymerase 
BZA (100 mM)               ½             inhibits proteases 
ZnCl2 (20 mM)              [illegible]   500 µM final concentration, for zinc finger folding 
Amino acid mix             [illegible] 
S30 Premix                 20 
S30 Extract                15 
RNasin                     ½             inhibits RNases 
T7 RNA polymerase          1 (1000 U) 
H2O                        3¾ 
                           ----- 
                           50 µl 

incubate 25 °C/30 minutes. 

add three volumes of ice cold stalling buffer. 

place on ice/ 15 minutes. 

Incubation is carried out at 25°C to help inhibit proteases and RNases, but can be done 
at any temperature up to 37°C. 
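
The stock volume needed to reach a stated final concentration follows from C1·V1 = C2·V2. As a hedged illustration, the sketch below applies this to the figures given for ZnCl2 (20 mM stock, 500 µM final, 50 µl reaction); the function name is hypothetical and the result is a derived value, not one read from the reaction table.

```python
def stock_volume_ul(stock_conc, final_conc, final_volume_ul):
    """Volume of stock (µl) needed so that C1 * V1 = C2 * V2.

    stock_conc and final_conc must be expressed in the same unit.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("final concentration must not exceed a positive stock concentration")
    return final_conc * final_volume_ul / stock_conc


if __name__ == "__main__":
    # 20 mM stock expressed in µM, 500 µM final, 50 µl reaction volume.
    zn_ul = stock_volume_ul(stock_conc=20_000.0, final_conc=500.0, final_volume_ul=50.0)
    print(f"ZnCl2 stock required: {zn_ul} µl")
```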

The ribosomes are stalled by adding to the in vitro synthesis system three volumes of 
"stalling/polysome buffer". 



Stalling/Polysome Buffer 

This is made as a 1⅓ x concentrate but its 1 x working concentrations are: 

Tris (pH7.4)       20mM 
KCl                50mM 
MgCl2              10mM 
DTT                5mM 
ZnCl2              50µM 
Chloramphenicol    18µg/ml 



The ingredients that matter for ribosome stalling are Mg2+ and chloramphenicol. High 
Mg2+ concentration (i.e. 10-20mM) greatly increases the 
affinity of ribosomes for mRNA (Holschuh & Gassen, (1982) J Biol Chem, 257, 1987- 
1992) and chloramphenicol causes ribosomes to stall, particularly strongly when 
combined with the MVKTDK sequence. Tests show that up to 50% of mRNA is 
attached to ribosomes after stalling, compared with 40% when using the construct of 
Mattheakis et al, 1994. 

To collect "polysomes": 

spin 90,000 rpm/30 minutes/4°C (all steps from here to the RT-PCR are carried 
out in the cold room). 

resuspend polysome pellet in 200µl of stalling buffer. 
incubate on rolling mixer for 20 minutes/4°C. 
spin down 13,000 rpm/5 minutes (to pellet insoluble material). 
collect supernatant. 



Example 4 

Affinity Selection Of Polysomes 

The target ds DNA binding site is pre-bound to streptavidin coated wells to give 
approximately 1mM concentration once polysomes are added. 



To 200µl polysome suspension, add 1µg poly d(I-C) competitor DNA. 

immediately add to binding site coated wells. 

incubate 30 minutes. 

wash 6 times with "washing buffer". 



Washing Buffer 

This is 1 x "stalling buffer", with the addition of 0.1% Tween 20 and twice the 
concentration of KCl and MgCl2, to help remove non-specifically bound proteins. 



wash 2 times with 1 x stalling buffer. 

add 100µl of elution buffer. 

incubate 30 minutes with gentle and occasional agitation. 



Elution Buffer 

This is the same as stalling buffer, but without chloramphenicol or MgCl2, and 
with the addition of 20mM EDTA. The EDTA chelates Mg2+ ions and dissociates 
ribosomes. 



Removal Of DNA Contamination 

To ensure that the next round of selection is not contaminated by template from the 
previous round, all ds DNA is removed by incubation with DNaseI. 



To 100µl eluted mRNA: 

add 4µl 1M MgCl2 (since DNaseI requires Mg2+) 

add 2U DNaseI 

incubate 37°C/15 minutes 

phenol extract 

ethanol precipitate mRNA 

resuspend in 20µl H2O. 
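
The reason for adding MgCl2 before the DNaseI step can be made explicit: the eluate contains 20mM EDTA, which would otherwise chelate the Mg2+ the enzyme needs. The following sketch estimates the Mg2+ left over, under the simplifying assumptions that EDTA binds Mg2+ one-to-one and completely, and that no other Mg2+ or chelator is carried over into the eluate. The function name is hypothetical.

```python
def free_mg_mM(eluate_ul, edta_mM, mgcl2_ul, mgcl2_M):
    """Estimate unchelated Mg2+ (mM) after adding MgCl2 to an EDTA-containing eluate.

    Assumes 1:1, complete chelation of Mg2+ by EDTA (a simplification).
    """
    edta_nmol = eluate_ul * edta_mM            # µl x mM = nmol
    mg_nmol = mgcl2_ul * mgcl2_M * 1000.0      # µl x M = µmol, converted to nmol
    total_ul = eluate_ul + mgcl2_ul
    return max(mg_nmol - edta_nmol, 0.0) / total_ul   # nmol/µl = mM


if __name__ == "__main__":
    # 100 µl eluate with 20 mM EDTA, plus 4 µl of 1 M MgCl2, as in the steps above.
    print(f"free Mg2+ is roughly {free_mg_mM(100.0, 20.0, 4.0, 1.0):.1f} mM")
```

On these assumptions the added MgCl2 exceeds the EDTA present, so Mg2+ remains available for the enzyme.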

Reverse Transcription 

Reverse transcription is then used to create a ss DNA template from the mRNA 
collected. 

PCR 

The ss DNA is then amplified by PCR to give a double-stranded, full-length 
template which can be used in the next round of selection experiments. 

Claims 

1. A method for producing a zinc finger nucleic acid binding protein comprising 
preparing a zinc finger protein according to design rules, varying the protein at one or more 
positions, and selecting variants which bind to a target nucleic acid sequence by polysome 
display. 

2. A method according to claim 1, wherein the protein is varied at one or more 
positions selected from the group consisting of +1, +5, +8, -1, +2, +3 and +6. 

3. A method for producing a zinc finger nucleic acid binding protein comprising an 
at least partially varied sequence and selecting variants thereof which bind to a target 
DNA strand, comprising the steps of: 

(i) preparing a nucleic acid binding protein of the Cys2-His2 zinc finger class capable 
of binding to a nucleic acid triplet in a target nucleic acid sequence, wherein binding to 
each base of the triplet by an α-helical zinc finger nucleic acid binding motif in the 
protein is determined as follows: 

a) if the 5' base in the triplet is G, then position +6 in the α-helix is Arg; or position +6 is 
Ser or Thr and position ++2 is Asp; 

b) if the 5' base in the triplet is A, then position +6 in the α-helix is Gln and ++2 is not 
Asp; 

c) if the 5' base in the triplet is T, then position +6 in the α-helix is Ser or Thr and 
position ++2 is Asp; 

d) if the 5' base in the triplet is C, then position +6 in the α-helix may be any amino acid, 
provided that position ++2 in the α-helix is not Asp; 

e) if the central base in the triplet is G, then position +3 in the α-helix is His; 

f) if the central base in the triplet is A, then position +3 in the α-helix is Asn; 

g) if the central base in the triplet is T, then position +3 in the α-helix is Ala, Ser or Val; 
provided that if it is Ala, then one of the residues at -1 or +6 is a small residue; 

h) if the central base in the triplet is C, then position +3 in the α-helix is Ser, Asp, Glu, 
Leu, Thr or Val; 

i) if the 3' base in the triplet is G, then position -1 in the α-helix is Arg; 

j) if the 3' base in the triplet is A, then position -1 in the α-helix is Gln; 

k) if the 3' base in the triplet is T, then position -1 in the α-helix is Asn or Gln; 

l) if the 3' base in the triplet is C, then position -1 in the α-helix is Asp; 

(ii) varying the resultant polypeptide at at least one position; and 

(iii) selecting the variants which bind to a target nucleic acid sequence by polysome 
display. 
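
For illustration only, rules a) to l) of claim 3 can be written as a lookup keyed by the base at each position of the triplet. The sketch below is one possible encoding, not a definitive implementation; the data structure and the function name `design_options` are hypothetical, and the proviso in rule g) is carried only as a comment.

```python
# Options for helix position +6 and position ++2, keyed by the 5' base (rules a-d).
FIVE_PRIME = {
    "G": [{"+6": "Arg"}, {"+6": "Ser or Thr", "++2": "Asp"}],
    "A": [{"+6": "Gln", "++2": "not Asp"}],
    "T": [{"+6": "Ser or Thr", "++2": "Asp"}],
    "C": [{"+6": "any amino acid", "++2": "not Asp"}],
}

# Residues allowed at helix position +3, keyed by the central base (rules e-h).
# Rule g proviso: if +3 is Ala, one of the residues at -1 or +6 is a small residue.
CENTRAL = {
    "G": ("His",),
    "A": ("Asn",),
    "T": ("Ala", "Ser", "Val"),
    "C": ("Ser", "Asp", "Glu", "Leu", "Thr", "Val"),
}

# Residues allowed at helix position -1, keyed by the 3' base (rules i-l).
THREE_PRIME = {
    "G": ("Arg",),
    "A": ("Gln",),
    "T": ("Asn", "Gln"),
    "C": ("Asp",),
}


def design_options(triplet):
    """Return the residue choices the rules give for a 5'->3' DNA triplet."""
    t = triplet.upper()
    if len(t) != 3 or any(base not in "GATC" for base in t):
        raise ValueError("triplet must be three bases from G, A, T, C")
    five, central, three = t
    return {
        "+6/++2": FIVE_PRIME[five],
        "+3": sorted(CENTRAL[central]),
        "-1": sorted(THREE_PRIME[three]),
    }
```

Calling `design_options("GCG")` returns the two alternatives for a 5' G, the six residues permitted at +3 for a central C, and Arg at -1 for a 3' G. Such a table only supplies the starting protein of step (i); variation and polysome selection in steps (ii) and (iii) are what refine it.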

4. A method according to any preceding claim, wherein the or each zinc finger has 
the general primary structure 

(A)  X^a C X2-4 C X2-3 F X^c  X X X X L X X H X X X^b H - linker 
                             -1 1 2 3 4 5 6 7 8 9 

wherein X (including X^a, X^b and X^c) is any amino acid. 
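
As a hedged illustration of the general primary structure, the core of formula (A) can be expressed as a regular expression and used to locate the helix positions -1 to +6 in a finger sequence. The sketch assumes the formula reads C X2-4 C X2-3 F X^c X X X X L X X H X X X^b H with positions -1 to 9 numbered from the residue after X^c; it ignores X^a and the linker, and the test protein is the 88-residue sequence of SEQ ID No. 2 written in one-letter code. The names `FINGER` and `helix_positions` are hypothetical.

```python
import re

# C X2-4 C X2-3 F X^c [-1 1 2 3] L(+4) [+5 +6] H(+7) X X X^b H
# The captured group covers helix positions -1 to +6 (seven residues).
FINGER = re.compile(r"C.{2,4}C.{2,3}F.(.{4}L.{2})H.{3}H")

# SEQ ID No. 2 in one-letter code (88 residues).
SEQ_ID_2 = (
    "AEEKPFQCRICMRNFS"
    "DRSSLTRHTRTHTGEK"
    "PFQCRICMRNFSRSDN"
    "LTRHLRTHTGEKPFQC"
    "RICMRNFRQADHLQEH"
    "LKTHTGEK"
)


def helix_positions(protein):
    """Return residues -1..+6 of every finger matching the assumed framework."""
    return [match.group(1) for match in FINGER.finditer(protein)]


if __name__ == "__main__":
    for number, helix in enumerate(helix_positions(SEQ_ID_2), start=1):
        print(f"finger {number}: -1..+6 = {helix}")
```

Applied to SEQ ID No. 2, the pattern finds three fingers, and the captured residues are those that the design rules and the variation step act upon.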


5. A method according to claim 4 wherein X^a is F/Y-X or P-F/Y-X. 

6. A method according to claim 4 or claim 5 wherein X2-4 is selected from any one 
of: S-X, E-X, K-X, T-X, P-X and R-X. 

7. A method according to any one of claims 4 to 6 wherein X^b is T or I. 

8. A method according to any one of claims 4 to 7 wherein X2-3 is G-K-A, G-K-C, 
G-K-S, G-K-G, M-R-N or M-R. 

9. A method according to any one of claims 4 to 8 wherein the linker is T-G-E-K or 

10. A method according to any one of claims 4 to 9 wherein position +9 is R or K. 

11. A method according to any one of claims 4 to 10 wherein positions +1, +5 and +8 
are not occupied by any one of the hydrophobic amino acids, F, W or Y. 

12. A method according to claim 11 wherein positions +1, +5 and +8 are occupied by 
the residues K, T and Q respectively. 

13. A method according to any preceding claim, wherein the polysome display 
technique comprises the steps of: 

(a) introducing a population of mRNA species into an in vitro translation system 
under conditions suitable for translation to form a pool of polysomes displaying nascent 
zinc finger polypeptides; 

(b) contacting the polysomes with a target nucleic acid under suitable binding 
conditions; 

(c) selecting polysomes which are specifically bound to the nucleic acid; and 

(d) reverse transcribing and amplifying the isolated mRNA. 




SEQUENCE LISTING 






<110> Gendaq Limited 

<120> Screening System 

<130> p3755 

<140> 
<141> 

<160> 2 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 264 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic DNA 

<220> 
<221> CDS 
<222> (1)..(264) 

<400> 1 

gca gaa gag aag cct ttt cag tgt cga atc tgc atg cgt aac ttc agc    48 
Ala Glu Glu Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser 
  1               5                  10                  15 

gat cgt agt agt ctt acc cgc cac acg agg acc cac aca ggc gag aag    96 
Asp Arg Ser Ser Leu Thr Arg His Thr Arg Thr His Thr Gly Glu Lys 
             20                  25                  30 

cct ttt cag tgt cga atc tgc atg cgt aac ttc agc agg agc gat aac    144 
Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Asp Asn 
         35                  40                  45 

ctt acg aga cac cta agg acc cac aca ggc gag aag cct ttt cag tgt    192 
Leu Thr Arg His Leu Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys 
     50                  55                  60 

cga atc tgc atg cgt aac ttc agg caa gct gat cat ctt caa gag cac    240 
Arg Ile Cys Met Arg Asn Phe Arg Gln Ala Asp His Leu Gln Glu His 
 65                  70                  75                  80 

cta aag acc cac aca ggc gag aag    264 
Leu Lys Thr His Thr Gly Glu Lys 
                 85 

<210> 2 

<211> 88 

<212> PRT 

<213> Artificial Sequence 

<223> Description of Artificial Sequence : Synthetic DNA 



<400> 2 

Ala Glu Glu Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser 
  1               5                  10                  15 

Asp Arg Ser Ser Leu Thr Arg His Thr Arg Thr His Thr Gly Glu Lys 
             20                  25                  30 

Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Asp Asn 
         35                  40                  45 

Leu Thr Arg His Leu Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys 
     50                  55                  60 

Arg Ile Cys Met Arg Asn Phe Arg Gln Ala Asp His Leu Gln Glu His 
 65                  70                  75                  80 

Leu Lys Thr His Thr Gly Glu Lys 
                 85 
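
The correspondence between the coding sequence of SEQ ID No. 1 and the protein of SEQ ID No. 2 can be verified by translating one into the other. The sketch below does this with the standard genetic code and no external libraries. The codons are entered in residue order 1 to 88, and it is assumed that the scan artefacts "eta", "ate", "age" and "get" stand for cta, atc, agc and gct, which is what the printed amino acids require.

```python
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID No. 1, codons in residue order (OCR-garbled codons read as noted above).
SEQ_ID_1 = " ".join([
    "gca gaa gag aag cct ttt cag tgt cga atc tgc atg cgt aac ttc agc",
    "gat cgt agt agt ctt acc cgc cac acg agg acc cac aca ggc gag aag",
    "cct ttt cag tgt cga atc tgc atg cgt aac ttc agc agg agc gat aac",
    "ctt acg aga cac cta agg acc cac aca ggc gag aag cct ttt cag tgt",
    "cga atc tgc atg cgt aac ttc agg caa gct gat cat ctt caa gag cac",
    "cta aag acc cac aca ggc gag aag",
]).split()

# SEQ ID No. 2 in one-letter code.
SEQ_ID_2 = (
    "AEEKPFQCRICMRNFSDRSSLTRHTRTHTGEK"
    "PFQCRICMRNFSRSDNLTRHLRTHTGEKPFQC"
    "RICMRNFRQADHLQEHLKTHTGEK"
)


def translate(codons):
    """Translate a list of lower-case DNA codons into a one-letter protein string."""
    return "".join(CODON_TABLE[codon] for codon in codons)


if __name__ == "__main__":
    protein = translate(SEQ_ID_1)
    print(len(SEQ_ID_1) * 3, "nucleotides,", len(protein), "residues")
    print("matches SEQ ID No. 2:", protein == SEQ_ID_2)
```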